Abstract

We give the avoidance indices for all unary patterns with involution. 

Keywords: words avoiding patterns, combinatorics on words, repetitions,
Thue-Morse word

Consider a non-empty word $p$ over $\Sigma = \{x, g(x)\}$. Here $g(x)$ is
literally the string `g(x)', so that if $p = xg(x)x$, we say that $|p| = 3$.
Let $T$ be a finite alphabet. We call $w \in T^*$ a morphic instance (resp.,
antimorphic instance) of $p$ if there is a morphic (resp., antimorphic)
involution $g_T$ of $T^*$ and a non-erasing morphism
$\phi : \Sigma^* \to T^*$ such that $\phi(g(x)) = g_T(\phi(x))$ and
$w = \phi(p)$. In the case that $p = xg(x)x$, a morphic (resp., antimorphic)
instance of $p$ would be a word $y g_T(y) y$ where $y \in T^+$ and $g_T$ is a
morphic (resp., antimorphic) involution of $T^*$. The morphic (resp.,
antimorphic) avoidance index of $p$ is the size of the smallest alphabet $T$
such that there exists an infinite word over $T$, no factor of which is a
morphic (resp., antimorphic) instance of $p$. Denote the morphic (resp.,
antimorphic) avoidance index of $p$ by $A_m(p)$ (resp., $A_a(p)$). If the
symbol $g(x)$ doesn't appear in $p$, then $A_m(p) = A_a(p)$ is just the usual
avoidance index of $p$. As is pointed out in [1], interchanging $x$'s and
$g(x)$'s in a pattern $p$ does not change the morphic or antimorphic
avoidance index. If $p \in \Sigma^3$, the avoidance index of $p$ is given in
[1]:



\[
A_m(p) = A_a(p) =
\begin{cases}
2, & p \in \{xxx,\ g(x)g(x)g(x)\} \\
3, & \text{otherwise.}
\end{cases}
\]



Although $\Sigma$ has two elements, it is natural to call words over $\Sigma$
unary patterns with involution. The next most complex patterns to consider
would be over $\{x, y, g(x), g(y)\}$, and we would consider them binary
patterns with involution. We will give the avoidance indices for all unary
patterns with involution.

The avoidance indices of words $x^n$ are known, so we need only consider
words $p$ for which $|p|_x, |p|_{g(x)} \ge 1$. Clearly,
$A_m(xg(x)) = A_a(xg(x)) = \infty$. The avoidance indices for patterns of
length 3 are known. We will show that whenever $p \in \Sigma^4$,
$A_m(p) = A_a(p) = 2$. Since every longer pattern has a factor in $\Sigma^4$,
and since no word can have avoidance index 1, we see that

\[
A_m(p) = A_a(p) =
\begin{cases}
3, & p \in \Sigma^3 - \{xxx,\ g(x)g(x)g(x)\} \\
\infty, & p \in \{x,\ g(x),\ xg(x),\ g(x)x\} \\
2, & \text{otherwise.}
\end{cases}
\]

The avoidance indices are clearly 2 when $xxx$ is a factor of $p$. We consider
words $p \in \Sigma^4$ where $xxx$ is not a factor. Interchanging $x$'s and
$g(x)$'s if necessary, assume that $|p|_x \ge |p|_{g(x)}$. Since $xxx$ is not
to be a factor of $p$, either $|p|_{g(x)} = 1$ or $|p|_{g(x)} = 2$. In the
first case, our word $p$ is $xxg(x)x$ or $xg(x)xx$. Since avoidance indices
are preserved under reversal, we need only consider the case $p = xxg(x)x$
here. If $|p|_{g(x)} = 2$, ignoring reversals, we consider $xg(x)xg(x)$,
$g(x)xxg(x)$, $xxg(x)g(x)$. For each of these $p \in \Sigma^4$ we will show
that both avoidance indices are 2. Simplifying (or abusing, if you prefer)
our notation, this amounts to constructing an infinite binary word with no
factor $xxg(x)x$ (resp., $xg(x)xg(x)$, $g(x)xxg(x)$, $xxg(x)g(x)$) where $x$
is non-empty and $g$ is a morphic (resp., antimorphic) involution.
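The reduction to binary words is easy to experiment with. The following is a minimal sketch, not part of the original argument: it searches a finite binary word for a factor that is an instance of a unary pattern with involution. The encoding of a pattern as a string over `x` and `g` (with `g` standing for the single symbol g(x)) and the names `find_instance`, `MORPHIC` and `ANTIMORPHIC` are assumptions made for this illustration.

```python
def complement(s):
    """Exchange 0 and 1 in a binary string."""
    return s.translate(str.maketrans("01", "10"))

# Involutions of {0,1}* used in the text (the dictionary names are ours).
MORPHIC = {"identity": lambda s: s, "complement": complement}
ANTIMORPHIC = {
    "reversal": lambda s: s[::-1],
    "reverse complement": lambda s: complement(s[::-1]),
}

def find_instance(word, pattern, involutions):
    """Return (start, x, name) for a factor of `word` that is an instance of
    `pattern` under one of `involutions`, or None if there is none.

    `pattern` is a string over 'x' and 'g'; 'g' stands for the symbol g(x).
    """
    k = len(pattern)
    for length in range(1, len(word) // k + 1):        # |x|, shortest first
        for start in range(len(word) - k * length + 1):
            first = word[start:start + length]
            for name, g in involutions.items():
                # g is an involution, so x can be recovered from g(x).
                x = first if pattern[0] == "x" else g(first)
                image = "".join(x if c == "x" else g(x) for c in pattern)
                if word.startswith(image, start):
                    return start, x, name
    return None

print(find_instance("110010", "xxgx", MORPHIC))  # (2, '0', 'complement')
print(find_instance("0011", "xxgx", MORPHIC))    # None
```

All four involutions preserve length, which is why the search only needs to consider factors of length $|p| \cdot |x|$.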



1 Morphic involutions 

Let $t$ be the Thue-Morse sequence $h^\omega(0)$, where $h(0) = 01$,
$h(1) = 10$. Write

\[
t = \prod_{i=0}^{\infty} t_i, \qquad t_i \in \{0, 1\}.
\]

Let $w$ be the infinite word

\[
w = \prod_{i=0}^{\infty} 0^2 1^{t_i + 2}.
\]

We see that $w$ is concatenated from blocks of two 0's alternated with blocks
of either two or three 1's.
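As an illustration, a prefix of $w$ can be generated directly from the definition. This is a sketch under our own naming (`thue_morse`, `prefix_of_w`); it uses the standard fact that $t_i$ is the parity of the number of 1 bits in the binary expansion of $i$.

```python
from itertools import groupby

def thue_morse(n):
    """First n letters t_0, ..., t_{n-1} of the Thue-Morse word h^omega(0)."""
    # t_i is the parity of the number of 1 bits in the binary expansion of i.
    return [bin(i).count("1") % 2 for i in range(n)]

def prefix_of_w(n_blocks):
    """Concatenate 0^2 1^(t_i + 2) for i = 0, ..., n_blocks - 1."""
    return "".join("00" + "1" * (t + 2) for t in thue_morse(n_blocks))

w = prefix_of_w(8)
# Run-length encoding: (letter, length of the maximal block of that letter).
runs = [(letter, len(list(group))) for letter, group in groupby(w)]
print(runs[:6])  # [('0', 2), ('1', 2), ('0', 2), ('1', 3), ('0', 2), ('1', 3)]
```

The run lengths of the 1-blocks, reduced by 2, spell out the Thue-Morse word again.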

Lemma 1. Word $w$ has no factor of the form $xxg(x)x$ where $x$ is a
non-empty word and $g(x)$ is the image of $x$ under a morphic involution of
$\{0, 1\}^*$.

Proof: Suppose for the sake of getting a contradiction that $xxg(x)x$ is a
factor of $w$ where $x$ is a non-empty word and $g$ is a morphic involution of
$\{0, 1\}^*$. The only such involutions are the identity and the complement
morphism, which exchanges 0 and 1.

If $|x|_0 = 0$, then $x = 1^m$ for some $m$. If $g$ is the identity, this
makes 1111 a factor of $w$, which is impossible. If $g$ is the complement
morphism, then $m \le 2$, since $g(x) = 0^m$ is a factor of $w$. Then,
however, $xxg(x)x = 1101$ or $11110011$, neither of which is a factor of $w$.
If $|x|_1 = 0$, then $x = 0$ or $x = 00$. If $g$ is the identity, this makes
0000 a factor of $w$, which is impossible. If $g$ is the complement morphism,
then 0010 or 00001100 is a factor of $w$, neither of which is possible. We
conclude that $|x|_0, |x|_1 \ge 1$.

Suppose that $g$ is the complement morphism. Word $w$ has factors $g(x)x$
and $xx$, and the last letters of $g(x)$ and $x$ differ, hence $w$ has factors
$0x$ and $1x$. This means that $x$ cannot start 01, 10 or 00, since none of
101, 010 or 000 are factors of $w$. We deduce that $x$ commences 11.
Similarly, $x$ ends 11. Now, however, $xx$ has 1111 as a factor, which is
impossible.

Suppose then that $g$ is the identity morphism, so that $xxxx$ is a factor of
$w$. Let $s \ge 0$ be maximal so that $0^s$ is a prefix of $x$. Let $t \ge 0$
be maximal such that $0^t$ is a suffix of $x$. Since $|x|_1 \ge 1$, $x$ has
prefix $0^s 1$ and suffix $1 0^t$, and $1 0^{t+s} 1$ is a factor of $xx$,
implying $t + s = 0$ or $t + s = 2$.
Case 1: Suppose $t + s = 0$. If $|x|_0 = 2$, write $x = 1^r 0^2 1^q$,
$r, q \ge 1$. Then
$xxxx = 1^r 0^2 1^{q+r} 0^2 1^{q+r} 0^2 1^{q+r} 0^2 1^q$, and the Thue-Morse
word $t$ contains the overlap $(q + r - 2)(q + r - 2)(q + r - 2)$, which is
impossible. Thus assume $|x|_0 > 2$, and write
$x = 1^r 0^2 1^{t_i+2} 0^2 \cdots 1^{t_j+2} 0^2 1^q$, $r, q \ge 1$,
$i \le j$. Then $xxxx$ is

\[
1^r 0^2 1^{t_i+2} \cdots 1^{t_j+2} 0^2 1^{q+r}
0^2 1^{t_i+2} \cdots 1^{t_j+2} 0^2 1^{q+r}
0^2 1^{t_i+2} \cdots 1^{t_j+2} 0^2 1^{q+r}
0^2 1^{t_i+2} \cdots 1^{t_j+2} 0^2 1^q
\]

and the Thue-Morse word $t$ contains the overlap
$(q + r - 2)\, t_i \cdots t_j\, (q + r - 2)\, t_i \cdots t_j\, (q + r - 2)$,
which is again impossible.

Case 2: Suppose $t + s = 2$. If $|x|_0 = 2$, write $x = 0^s 1^{t_i+2} 0^t$,
some $i$. Then
$xxxx = 0^s 1^{t_i+2} 0^2 1^{t_i+2} 0^2 1^{t_i+2} 0^2 1^{t_i+2} 0^t$, and the
Thue-Morse word $t$ contains the overlap $t_i t_i t_i$, which is impossible.
Thus assume $|x|_0 > 2$, and write
$x = 0^s 1^{t_i+2} 0^2 \cdots 1^{t_j+2} 0^t$, $i < j$. Then $xxxx$ is

\[
0^s 1^{t_i+2} \cdots 1^{t_j+2} 0^2 1^{t_i+2} \cdots 1^{t_j+2}
0^2 1^{t_i+2} \cdots 1^{t_j+2} 0^2 1^{t_i+2} \cdots 1^{t_j+2} 0^t
\]

and the Thue-Morse word $t$ contains the overlap
$t_i \cdots t_j\, t_i \cdots t_j\, t_i$, which is again impossible. □
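A finite search cannot replace the proof of Lemma 1, but it gives a quick sanity check on a prefix of $w$. The sketch below is ours (the helper `has_xxgx` and the prefix length of 64 blocks are arbitrary choices) and tests both morphic involutions of $\{0,1\}^*$.

```python
def thue_morse(n):
    """First n letters of the Thue-Morse word (parity of the bit count of i)."""
    return [bin(i).count("1") % 2 for i in range(n)]

def complement(s):
    """Exchange 0 and 1 in a binary string."""
    return s.translate(str.maketrans("01", "10"))

def has_xxgx(word, g):
    """True if `word` has a factor x x g(x) x with x non-empty."""
    n = len(word)
    for length in range(1, n // 4 + 1):
        for start in range(n - 4 * length + 1):
            x = word[start:start + length]
            if word.startswith(x + x + g(x) + x, start):
                return True
    return False

# Prefix of w made of the first 64 blocks 0^2 1^(t_i + 2).
w = "".join("00" + "1" * (t + 2) for t in thue_morse(64))

print(has_xxgx("0010", complement))   # True: x = 0 gives 0 0 1 0
print(has_xxgx(w, lambda s: s))       # identity involution
print(has_xxgx(w, complement))        # complement involution
```

By Lemma 1 the last two calls are expected to report no instance; a longer prefix can be checked by raising the block count.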
Let v be the infinite word 

\[
v = \prod_{i=0}^{\infty} 0\, 1^{2 t_i + 1}.
\]

We see that $v$ is concatenated from 0's alternated with blocks of either one
or three 1's.

Lemma 2. Word $v$ has no factor of the form $g(x)xxg(x)$ where $x$ is a
non-empty word and $g(x)$ is the image of $x$ under a morphic involution of
$\{0, 1\}^*$.

Proof: Suppose for the sake of getting a contradiction that $g(x)xxg(x)$ is a
factor of $v$ where $x$ is a non-empty word and $g$ is a morphic involution of
$\{0, 1\}^*$.

Since 00 is not a factor of $v$ but $xx$ is a factor, $|x|_1 \ge 1$. If
$|x|_0 = 0$, then $x = 1^m$ for some $m$. If $g$ is the identity, this makes
1111 a factor of $v$, which is impossible. If $g$ is the complement morphism,
then $m = 1$, since $g(x) = 0^m$ is a factor of $v$. Then, however,
$g(x)xxg(x) = 0110$, which is not a factor of $v$. We conclude that
$|x|_0, |x|_1 \ge 1$.

Suppose that g is the complement morphism. If x begins and ends with 
different letters, then one of g(x)x and xg(x) has 00 as a factor, which is 
impossible. Therefore the first and last letters of x are the same. They must 
both be 1; otherwise xx would contain 00. Again 11 cannot be a factor of x; 
otherwise 00 would be a factor of g(x). It follows that x begins with 10 and 
ends with 01. Now, however, xx has the factor 0110, which is impossible. 

Suppose then that $g$ is the identity, so that $xxxx$ is a factor of $v$. If
$|x|_0 = 1$, write $x = 1^q 0 1^r$, some $q, r \ge 0$. We must have
$q + r \ge 1$, since $|x|_1 \ge 1$. Now
$xxxx = 1^q 0 1^{r+q} 0 1^{r+q} 0 1^{r+q} 0 1^r$. This implies the existence
of an overlap $\frac{r+q-1}{2}\,\frac{r+q-1}{2}\,\frac{r+q-1}{2}$ in $t$,
which is impossible.

Assume then that $|x|_0 \ge 2$. Write
$x = 1^q 0 1^{2t_i+1} \cdots 1^{2t_j+1} 0 1^r$ for some $i \le j$, some
$q, r \ge 0$. Then $xxxx$ has the factor

\[
1^{r+q} 0 1^{2t_i+1} \cdots 1^{2t_j+1} 0 1^{r+q} 0 1^{2t_i+1} \cdots
1^{2t_j+1} 0 1^{r+q}
\]

and $t$ contains the overlap

\[
\frac{r+q-1}{2}\; t_i \cdots t_j\; \frac{r+q-1}{2}\; t_i \cdots t_j\;
\frac{r+q-1}{2}.
\]

This is impossible. □
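The identity case of this proof reads the block lengths of $v$ back as letters of $t$ and then appeals to the overlap-freeness of the Thue-Morse word. The following sketch (function names are ours) makes both steps concrete on a finite prefix: it decodes the 1-blocks of $v$ and searches for an overlap, that is, a factor $a\,u\,a\,u\,a$ with $a$ a letter.

```python
from itertools import groupby

def thue_morse(n):
    """First n letters of the Thue-Morse word (parity of the bit count of i)."""
    return [bin(i).count("1") % 2 for i in range(n)]

def decode_v(v):
    """Recover t_i from the complete blocks 1^(2 t_i + 1) of a prefix of v."""
    return [(len(list(g)) - 1) // 2 for letter, g in groupby(v) if letter == "1"]

def has_overlap(seq):
    """True if `seq` has a factor a u a u a (a a letter, u possibly empty)."""
    n = len(seq)
    for period in range(1, n):                 # period = |a u|
        for start in range(n - 2 * period):    # factor has length 2*period + 1
            if all(seq[k] == seq[k + period]
                   for k in range(start, start + period + 1)):
                return True
    return False

t = thue_morse(64)
v = "".join("0" + "1" * (2 * ti + 1) for ti in t)

print(decode_v(v) == t)              # True: block lengths encode t
print(has_overlap(t))                # False: no overlap in this prefix of t
print(has_overlap([0, 1, 0, 1, 0]))  # True: a = 0, u = 1
```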
Let $u$ be the infinite word

\[
u = \prod_{i=0}^{\infty} 0\, 1^{t_i + 2}.
\]

We see that $u$ is concatenated from 0's alternated with blocks of either 3 or
2 1's.

Lemma 3. Word u has no factor of the form xxg(x)g(x) or xg(x)xg(x) 
where x is a non-empty word and g(x) is the image of x under a morphic 
involution of {0, 1}*. 

Proof: Suppose for the sake of getting a contradiction that $xxg(x)g(x)$ or
$xg(x)xg(x)$ is a factor of $u$ where $x$ is a non-empty word and $g$ is a
morphic involution of $\{0, 1\}^*$.

First suppose that g is the complement morphism. Since u contains a 
factor g(x), but no factor 00, word x cannot contain 11 as a factor. Similarly, 
u doesn't contain a factor 010, so that x cannot contain a factor 101. The 
only possibilities for x are then 0, 1, 01 and 10. The resulting values for 
xxg(x)g(x) (resp. xg(x)xg(x)) would be 0011, 1100, 01011010, 10100101 
(resp. 0101, 1010, 01100110, 10011001) which all contain either 00 or 010 
and are thus impossible. 

Suppose then that $g$ is the identity morphism. Thus $xxg(x)g(x) =
xg(x)xg(x) = xxxx$. Since 00 is not a factor of $u$ but $xx$ is a factor,
$|x|_1 \ge 1$. If $|x|_0 = 0$, then $x = 1^m$ for some $m$, and 1111 is a
factor of $u$. This is impossible. It follows that $|x|_0, |x|_1 \ge 1$. If
$|x|_0 = 1$, write $x = 1^q 0 1^r$, some $q, r \ge 0$. Then
$xxxx = 1^q 0 1^{r+q} 0 1^{r+q} 0 1^{r+q} 0 1^r$. This implies the existence
of an overlap $(r + q - 2)(r + q - 2)(r + q - 2)$ in $t$, which is impossible.

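Assume then that $|x|_0 \ge 2$. Write
$x = 1^q 0 1^{t_i+2} \cdots 1^{t_j+2} 0 1^r$ for some $i \le j$, some
$q, r \ge 0$. Then $xxxx$ has the factor

\[
1^{r+q} 0 1^{t_i+2} \cdots 1^{t_j+2} 0 1^{r+q} 0 1^{t_i+2} \cdots
1^{t_j+2} 0 1^{r+q}
\]

and $t$ contains the overlap

\[
(r + q - 2)\, t_i \cdots t_j\, (r + q - 2)\, t_i \cdots t_j\, (r + q - 2).
\]

This is impossible. □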
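The complement case of the proof of Lemma 3 is a finite enumeration, which can be replayed mechanically. The sketch below (names such as `complement_case_words` are ours) rebuilds the eight candidate words from $x \in \{0, 1, 01, 10\}$ and reports which of the forbidden factors 00 and 010 each one contains; it also checks that neither factor occurs in a prefix of $u$.

```python
def thue_morse(n):
    """First n letters of the Thue-Morse word (parity of the bit count of i)."""
    return [bin(i).count("1") % 2 for i in range(n)]

def complement(s):
    """Exchange 0 and 1 in a binary string."""
    return s.translate(str.maketrans("01", "10"))

# Words that the proof states are not factors of u.
FORBIDDEN = ("00", "010")

def complement_case_words():
    """Yield (x, pattern, word) for the eight words listed in the proof."""
    for x in ("0", "1", "01", "10"):
        gx = complement(x)
        yield x, "xxg(x)g(x)", x + x + gx + gx
        yield x, "xg(x)xg(x)", x + gx + x + gx

# Prefix of u made of the first 64 blocks 0 1^(t_i + 2).
u = "".join("0" + "1" * (t + 2) for t in thue_morse(64))
print([f for f in FORBIDDEN if f in u])  # []

for x, pattern, word in complement_case_words():
    # Each word should contain at least one forbidden factor.
    print(x, pattern, word, [f for f in FORBIDDEN if f in word])
```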
2 Antimorphic involutions



Over $\{0, 1\}$, there are only two antimorphic involutions: the reversal
$x \to x^R$ generated by $0^R = 0$ and $1^R = 1$, and the reverse complement
$x \to \overline{x^R}$.
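For concreteness, the two maps can be written down and tested on short words. This is an illustrative sketch with our own function names; it checks the involution property $g(g(u)) = u$ and the antimorphism property $g(uv) = g(v)g(u)$ exhaustively on binary words of length at most 4, which is evidence rather than a proof.

```python
from itertools import product

def complement(s):
    """Exchange 0 and 1 in a binary string."""
    return s.translate(str.maketrans("01", "10"))

def reversal(s):
    """The reversal x -> x^R."""
    return s[::-1]

def reverse_complement(s):
    """The reverse complement: complement of the reversal of s."""
    return complement(s[::-1])

def is_antimorphic_involution(g, max_len=4):
    """Check g(g(u)) == u and g(u v) == g(v) g(u) on all short binary words."""
    words = ["".join(p) for n in range(max_len + 1)
             for p in product("01", repeat=n)]
    involution = all(g(g(u)) == u for u in words)
    antimorphic = all(g(u + v) == g(v) + g(u) for u in words for v in words)
    return involution and antimorphic

print(is_antimorphic_involution(reversal))            # True
print(is_antimorphic_involution(reverse_complement))  # True
print(is_antimorphic_involution(complement))          # False: it is a morphism
print(reverse_complement("0011"))                     # 0011
```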

Lemma 4. Word $w$ has no factor of the form $xxg(x)x$ where $x$ is a
non-empty word and $g(x)$ is the image of $x$ under an antimorphic involution
of $\{0, 1\}^*$.

Proof: Suppose for the sake of getting a contradiction that $xxg(x)x$ is a
factor of $w$ where $x$ is a non-empty word and $g$ is an antimorphic
involution of $\{0, 1\}^*$.

By Lemma 1 we may assume that $g(x) \ne x$, since we have shown that $w$ has
no factor $xxxx$ with $x$ non-empty. Similarly, we may assume that
$g(x) \ne \overline{x}$. These conditions together imply that $x$ is not a
palindrome, and that $\overline{x^R} \ne x$. Suppose, for example, that $x$
is a palindrome. If $g$ is reversal, then $g(x) = x$, which we have forbidden.
If $g$ is reverse complement, then $g(x) = \overline{x^R} = \overline{x}$,
again forbidden. Similarly one checks that $\overline{x^R} \ne x$.

To continue with our proof, suppose that $g$ is the reverse complement. Since
$w$ contains a factor $g(x)$, but no factor 000, word $x$ cannot contain 111
as a factor. Also, $w$ does not contain 010 or 101 as a factor. It follows
that $x$ is a factor of $(0011)^\omega$. Since $xg(x)$ and $g(x)x$ are factors
of $w$, $x$ cannot begin or end with 01 or 10. It therefore begins and ends
with 00 or 11. The length 2 prefix and length 2 suffix of $x$ must differ,
since otherwise $xx$ would have 0000 or 1111 as a factor. We conclude that
$x = (0011)^n$ or $x = (1100)^n$ for some $n$. But then $x$ is the complement
of its reverse, contradicting our previous assumption.

Suppose then that g is the reversal. Since xg(x) and xx are both factors 
of w but 010, 101 are not, x cannot end in 01 or 10. Then x ends in 00 or 
11, and xg(x) contains 0000 or 1111 as a factor. This is impossible. □ 

Lemma 5. Word $(0001)^\omega$ has no factor of the form $xxg(x)g(x)$,
$xg(x)xg(x)$ or $g(x)xxg(x)$ where $x$ is a non-empty word and $g(x)$ is the
image of $x$ under an antimorphic involution of $\{0, 1\}^*$.

Proof: Suppose for the sake of getting a contradiction that $xxg(x)g(x)$,
$xg(x)xg(x)$ or $g(x)xxg(x)$ is a factor of $(0001)^\omega$ where $x$ is a
non-empty word and $g$ is an antimorphic involution of $\{0, 1\}^*$.

If $g$ is reversal, then $x$ cannot end in 01 or 10; this would imply 0110
or 1001 as a factor of $xg(x)$; however, these are not factors of
$(0001)^\omega$. It follows that if $|x| > 1$ then $x$ ends in 00, since 11 is
not a factor of $(0001)^\omega$. Then, however, 0000 is a factor of $xg(x)$,
which is impossible. We conclude that $|x| = 1$, and $xxg(x)g(x)$,
$xg(x)xg(x)$, $g(x)xxg(x) \in \{1111, 0000\}$. This is impossible.

If $g$ is reverse complement, 00 cannot be a factor of $x$; otherwise 11 is
a factor of $g(x)$. However, $x$ cannot end in 01 or 10, or $xg(x)$ would have
0101 or 1010 as a factor. We conclude that $|x| = 1$, and $xxg(x)g(x)$,
$xg(x)xg(x)$, $g(x)xxg(x) \in \{0011, 0101, 1001\}$, which are impossible. □
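Because $(0001)^\omega$ is periodic, Lemma 5 lends itself to a small exhaustive experiment. The sketch below is ours: patterns are encoded as strings over `x` and `g` (with `g` standing for the symbol g(x)), the helper `instances` is hypothetical, and the prefix length of 16 periods is an arbitrary choice, so the search only covers instances that fit inside that prefix.

```python
def complement(s):
    """Exchange 0 and 1 in a binary string."""
    return s.translate(str.maketrans("01", "10"))

# The two antimorphic involutions of {0,1}* (dictionary name is ours).
ANTIMORPHIC = {
    "reversal": lambda s: s[::-1],
    "reverse complement": lambda s: complement(s[::-1]),
}
PATTERNS = ("xxgg", "xgxg", "gxxg")   # 'g' stands for the symbol g(x)

def instances(word, pattern, g):
    """List (start, x) such that the image of `pattern` occurs at `start`."""
    found = []
    n, k = len(word), len(pattern)
    for length in range(1, n // k + 1):
        for start in range(n - k * length + 1):
            first = word[start:start + length]
            # g is an involution, so x can be recovered from g(x).
            x = first if pattern[0] == "x" else g(first)
            image = "".join(x if c == "x" else g(x) for c in pattern)
            if word.startswith(image, start):
                found.append((start, x))
    return found

word = "0001" * 16   # prefix of (0001)^omega
for pattern in PATTERNS:
    for name, g in ANTIMORPHIC.items():
        print(pattern, name, instances(word, pattern, g))
```

Each line printed should end with an empty list, in agreement with the lemma.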